detection and diagnosis
Mispronunciation Detection and Diagnosis Without Model Training: A Retrieval-Based Approach
Tu, Huu Tuong, Khanh, Ha Viet, Dat, Tran Tien, Huan, Vu, Van Luong, Thien, Cuong, Nguyen Tien, Trang, Nguyen Thi Thu
ABSTRACT Mispronunciation Detection and Diagnosis (MDD) is crucial for language learning and speech therapy. Unlike conventional methods that require scoring models or training phoneme-level models, we propose a novel training-free framework that leverages retrieval techniques with a pre-trained Automatic Speech Recognition model. Our method avoids phoneme-specific modeling or additional task-specific training, while still achieving accurate detection and diagnosis of pronunciation errors. Experiments on the L2-ARCTIC dataset show that our method achieves a superior F1 score of 69.60% while avoiding the complexity of model training. Index T erms-- Mispronunciation detection and diagnosis, retrieval-based methods, training-free framework, automatic pronunciation assessment 1. INTRODUCTION Mispronunciation Detection and Diagnosis is a fundamental task in Computer-Assisted Pronunciation Training (CAPT).
Reproducibility of Machine Learning-Based Fault Detection and Diagnosis for HVAC Systems in Buildings: An Empirical Study
Mukhtar, Adil, Hadwiger, Michael, Wotawa, Franz, Schweiger, Gerald
Reproducibility is a cornerstone of scientific research, enabling independent verification and validation of empirical findings. The topic gained prominence in fields such as psychology and medicine, where concerns about non - replicable results sparked ongoing discussions about research practices. In recent years, the fast-growing field of Machine Learning (ML) has become part of this discourse, as it faces similar concerns about transparency and reliability. Some reproducibility issues in ML research are shared with other fields, such as limited access to data and missing methodological details. In addition, ML introduces specific challenges, including inherent nondeterminism and computational constraints. While reproducibility issues are increasingly recognized by the ML community and its major conferences, less is known about how these challenges manifest in applied disciplines. This paper contributes to closing this gap by analyzing the transparency and reproducibility standards of ML applications in building energy systems. The results indicate that nearly all articles are not reproducible due to insufficient disclosure across key dimensions of reproducibility. 72% of the articles do not specify whether the dataset used is public, proprietary, or commercially available. Only two papers share a link to their code - one of which was broken. Two-thirds of the publications were authored exclusively by academic researchers, yet no significant differences in reproducibility were observed compared to publications with industry-affiliated authors. These findings highlight the need for targeted interventions, including reproducibility guidelines, training for researchers, and policies by journals and conferences that promote transparency and reproducibility.
Segmentation-free Goodness of Pronunciation
Cao, Xinwei, Fan, Zijian, Svendsen, Torbjørn, Salvi, Giampiero
Mispronunciation detection and diagnosis (MDD) is a significant part in modern computer aided language learning (CALL) systems. Within MDD, phoneme-level pronunciation assessment is key to helping L2 learners improve their pronunciation. However, most systems are based on a form of goodness of pronunciation (GOP) which requires pre-segmentation of speech into phonetic units. This limits the accuracy of these methods and the possibility to use modern CTC-based acoustic models for their evaluation. In this study, we first propose self-alignment GOP (GOP-SA) that enables the use of CTC-trained ASR models for MDD. Next, we define a more general alignment-free method that takes all possible alignments of the target phoneme into account (GOP-AF). We give a theoretical account of our definition of GOP-AF, an implementation that solves potential numerical issues as well as a proper normalization which makes the method applicable with acoustic models with different peakiness over time. We provide extensive experimental results on the CMU Kids and Speechocean762 datasets comparing the different definitions of our methods, estimating the dependency of GOP-AF on the peakiness of the acoustic models and on the amount of context around the target phoneme. Finally, we compare our methods with recent studies over the Speechocean762 data showing that the feature vectors derived from the proposed method achieve state-of-the-art results on phoneme-level pronunciation assessment.
Fault detection and diagnosis for the engine electrical system of a space launcher based on a temporal convolutional autoencoder and calibrated classifiers
Basora, Luis, Bocquet-Nouaille, Louison, Robinson, Elinirina, Gonidec, Serge Le
In the context of the health monitoring for the next generation of reusable space launchers, we outline a first step toward developing an onboard fault detection and diagnostic capability for the electrical system that controls the engine valves. Unlike existing approaches in the literature, our solution is designed to meet a broader range of key requirements. This includes estimating confidence levels for predictions, detecting out-of-distribution (OOD) cases, and controlling false alarms. The proposed solution is based on a temporal convolutional autoencoder to automatically extract low-dimensional features from raw sensor data. Fault detection and diagnosis are respectively carried out using a binary and a multiclass classifier trained on the autoencoder latent and residual spaces. The classifiers are histogram-based gradient boosting models calibrated to output probabilities that can be interpreted as confidence levels. A relatively simple technique, based on inductive conformal anomaly detection, is used to identify OOD data. We leverage other simple yet effective techniques, such as cumulative sum control chart (CUSUM) to limit the false alarms, and threshold moving to address class imbalance in fault detection. The proposed framework is highly configurable and has been evaluated on simulated data, covering both nominal and anomalous operational scenarios. The results indicate that our solution is a promising first step, though testing with real data will be necessary to ensure that it achieves the required maturity level for operational use.
Dynamic System Model Generation for Online Fault Detection and Diagnosis of Robotic Systems
Kohl, Johannes, Muck, Georg, Jäger, Georg, Zug, Sebastian
With the rapid development of more complex robots, Fault Detection and Diagnosis (FDD) becomes increasingly harder. Especially the need for predetermined models and historic data is problematic because they do not encompass the dynamic and fast-changing nature of such systems. To this end, we propose a concept that actively generates a dynamic system model at runtime and utilizes it to locate root causes. The goal is to be applicable to all kinds of robotic systems that share a similar software design. Additionally, it should exhibit minimal overhead and enhance independence from expert attention.
FaultExplainer: Leveraging Large Language Models for Interpretable Fault Detection and Diagnosis
Khan, Abdullah, Nahar, Rahul, Chen, Hao, Flores, Gonzalo E. Constante, Li, Can
Machine learning algorithms are increasingly being applied to fault detection and diagnosis (FDD) in chemical processes. However, existing data-driven FDD platforms often lack interpretability for process operators and struggle to identify root causes of previously unseen faults. This paper presents FaultExplainer, an interactive tool designed to improve fault detection, diagnosis, and explanation in the Tennessee Eastman Process (TEP). FaultExplainer integrates real-time sensor data visualization, Principal Component Analysis (PCA)-based fault detection, and identification of top contributing variables within an interactive user interface powered by large language models (LLMs). We evaluate the LLMs' reasoning capabilities in two scenarios: one where historical root causes are provided, and one where they are not to mimic the challenge of previously unseen faults. Experimental results using GPT-4o and o1-preview models demonstrate the system's strengths in generating plausible and actionable explanations, while also highlighting its limitations, including reliance on PCA-selected features and occasional hallucinations.
Machine Learning for Complex Systems with Abnormal Pattern by Exception Maximization Outlier Detection Method
Zhang, Zhikun, Duan, Yiting, Wang, Xiangjun, Zhang, Mingyuan
This paper proposes a novel fast online methodology for outlier detection called the exception maximization outlier detection method(EMODM), which employs probabilistic models and statistical algorithms to detect abnormal patterns from the outputs of complex systems. The EMODM is based on a two-state Gaussian mixture model and demonstrates strong performance in probability anomaly detection working on real-time raw data rather than using special prior distribution information. We confirm this using the synthetic data from two numerical cases. For the real-world data, we have detected the short circuit pattern of the circuit system using EMODM by the current and voltage output of a three-phase inverter. The EMODM also found an abnormal period due to COVID-19 in the insured unemployment data of 53 regions in the United States from 2000 to 2024. The application of EMODM to these two real-life datasets demonstrated the effectiveness and accuracy of our algorithm.
Intelligent Condition Monitoring of Industrial Plants: An Overview of Methodologies and Uncertainty Management Strategies
Ahang, Maryam, Charter, Todd, Ogunfowora, Oluwaseyi, Khadivi, Maziyar, Abbasi, Mostafa, Najjaran, Homayoun
Condition monitoring plays a significant role in the safety and reliability of modern industrial systems. Artificial intelligence (AI) approaches are gaining attention from academia and industry as a growing subject in industrial applications and as a powerful way of identifying faults. This paper provides an overview of intelligent condition monitoring and fault detection and diagnosis methods for industrial plants with a focus on the open-source benchmark Tennessee Eastman Process (TEP). In this survey, the most popular and state-of-the-art deep learning (DL) and machine learning (ML) algorithms for industrial plant condition monitoring, fault detection, and diagnosis are summarized and the advantages and disadvantages of each algorithm are studied. Challenges like imbalanced data, unlabelled samples and how deep learning models can handle them are also covered. Finally, a comparison of the accuracies and specifications of different algorithms utilizing the Tennessee Eastman Process (TEP) is conducted. This research will be beneficial for both researchers who are new to the field and experts, as it covers the literature on condition monitoring and state-of-the-art methods alongside the challenges and possible solutions to them.
Unsupervised Learning for Fault Detection of HVAC Systems: An OPTICS -based Approach for Terminal Air Handling Units
Rajabi, Farivar, McArthur, J. J.
The rise of AI-powered classification techniques has ushered in a new era for data-driven Fault Detection and Diagnosis in smart building systems. While extensive research has championed supervised FDD approaches, the real-world application of unsupervised methods remains limited. Among these, cluster analysis stands out for its potential with Building Management System data. This study introduces an unsupervised learning strategy to detect faults in terminal air handling units and their associated systems. The methodology involves pre-processing historical sensor data using Principal Component Analysis to streamline dimensions. This is then followed by OPTICS clustering, juxtaposed against k-means for comparison. The effectiveness of the proposed strategy was gauged using several labeled datasets depicting various fault scenarios and real-world building BMS data. Results showed that OPTICS consistently surpassed k-means in accuracy across seasons. Notably, OPTICS offers a unique visualization feature for users called reachability distance, allowing a preview of detected clusters before setting thresholds. Moreover, according to the results, while PCA is beneficial for reducing computational costs and enhancing noise reduction, thereby generally improving the clarity of cluster differentiation in reachability distance. It also has its limitations, particularly in complex fault scenarios. In such cases, PCA's dimensionality reduction may result in the loss of critical information, leading to some clusters being less discernible or entirely undetected. These overlooked clusters could be indicative of underlying faults, and their obscurity represents a significant limitation of PCA when identifying potential fault lines in intricate datasets.
SensorSCAN: Self-Supervised Learning and Deep Clustering for Fault Diagnosis in Chemical Processes
Golyadkin, Maksim, Pozdnyakov, Vitaliy, Zhukov, Leonid, Makarov, Ilya
Modern industrial facilities generate large volumes of raw sensor data during the production process. This data is used to monitor and control the processes and can be analyzed to detect and predict process abnormalities. Typically, the data has to be annotated by experts in order to be used in predictive modeling. However, manual annotation of large amounts of data can be difficult in industrial settings. In this paper, we propose SensorSCAN, a novel method for unsupervised fault detection and diagnosis, designed for industrial chemical process monitoring. We demonstrate our model's performance on two publicly available datasets of the Tennessee Eastman Process with various faults. The results show that our method significantly outperforms existing approaches (+0.2-0.3 TPR for a fixed FPR) and effectively detects most of the process faults without expert annotation. Moreover, we show that the model fine-tuned on a small fraction of labeled data nearly reaches the performance of a SOTA model trained on the full dataset. We also demonstrate that our method is suitable for real-world applications where the number of faults is not known in advance. The code is available at https://github.com/AIRI-Institute/sensorscan.